96 research outputs found

    CONTRAfold: RNA secondary structure prediction without physics-based models

    doi:10.1093/bioinformatics/btl24

    Genome-wide analysis points to roles for extracellular matrix remodeling, the visual cycle, and neuronal development in myopia

    Myopia, or nearsightedness, is the most common eye disorder, resulting primarily from excess elongation of the eye. The etiology of myopia, although known to be complex, is poorly understood. Here we report the largest ever genome-wide association study (43,360 participants) on myopia in Europeans. We performed a survival analysis on age of myopia onset and identified 19 significant associations (p < 5e-8), two of which are replications of earlier associations with refractive error. These 19 associations in total explain 2.7% of the variance in myopia age of onset, and point towards a number of different mechanisms behind the development of myopia. One association is in the gene PRSS56, which has previously been linked to abnormally small eyes; one is in a gene that forms part of the extracellular matrix (LAMA2); two are in or near genes involved in the regeneration of 11-cis-retinal (RGR and RDH5); two are near genes known to be involved in the growth and guidance of retinal ganglion cells (ZIC2, SFRP1); and five are in or near genes involved in neuronal signaling or development. These novel findings point towards multiple genetic factors involved in the development of myopia and suggest that complex interactions between extracellular matrix remodeling, neuronal development, and visual signals from the retina may underlie the development of myopia in humans

    CONTRAST: a discriminative, phylogeny-free approach to multiple informant de novo gene prediction

    CONTRAST is a gene predictor that directly incorporates information from multiple alignments and uses discriminative machine learning techniques to give large improvements in prediction over previous methods

    Efficient Replication of Over 180 Genetic Associations with Self-Reported Medical Data

    While the cost and speed of generating genomic data have come down dramatically in recent years, the slow pace of collecting medical data for large cohorts continues to hamper genetic research. Here we evaluate a novel online framework for amassing large amounts of medical information in a recontactable cohort by assessing our ability to replicate genetic associations using these data. Using web-based questionnaires, we gathered self-reported data on 50 medical phenotypes from a generally unselected cohort of over 20,000 genotyped individuals. Of a list of genetic associations curated by NHGRI, we successfully replicated about 75% of the associations that we expected to (based on the number of cases in our cohort and reported odds ratios, and excluding a set of associations with contradictory published evidence). Altogether we replicated over 180 previously reported associations, including many for type 2 diabetes, prostate cancer, cholesterol levels, and multiple sclerosis. We found significant variation across categories of conditions in the percentage of expected associations that we were able to replicate, which may reflect systematic inflation of the effects in some initial reports, or differences across diseases in the likelihood of misdiagnosis or misreport. We also demonstrated that we could improve replication success by taking advantage of our recontactable cohort, offering more in-depth questions to refine self-reported diagnoses. Our data suggests that online collection of self-reported data in a recontactable cohort may be a viable method for both broad and deep phenotyping in large populations

    Web-Based Genome-Wide Association Study Identifies Two Novel Loci and a Substantial Genetic Component for Parkinson's Disease

    Although the causes of Parkinson's disease (PD) are thought to be primarily environmental, recent studies suggest that a number of genes influence susceptibility. Using targeted case recruitment and online survey instruments, we conducted the largest case-control genome-wide association study (GWAS) of PD based on a single collection of individuals to date (3,426 cases and 29,624 controls). We discovered two novel, genome-wide significant associations with PD: rs6812193 near SCARB2 and rs11868035 near SREBF1/RAI1, both replicated in an independent cohort. We also replicated 20 previously discovered genetic associations (including LRRK2, GBA, SNCA, MAPT, GAK, and the HLA region), providing support for our novel study design. Relying on a recently proposed method based on genome-wide sharing estimates between distantly related individuals, we estimated the heritability of PD to be at least 0.27. Finally, using sparse regression techniques, we constructed predictive models that account for 6%–7% of the total variance in liability and that suggest the presence of true associations just beyond genome-wide significance, as confirmed through both internal and external cross-validation. These results indicate a substantial, but by no means total, contribution of genetics underlying susceptibility to both early-onset and late-onset PD, suggesting that, despite the novel associations discovered here and elsewhere, the majority of the genetic component for Parkinson's disease remains to be discovered

    Multiple alignment of protein sequences with repeats and rearrangements

    Multiple sequence alignments are the usual starting point for analyses of protein structure and evolution. For proteins with repeated, shuffled and missing domains, however, traditional multiple sequence alignment algorithms fail to provide an accurate view of homology between related proteins, because they either assume that the input sequences are globally alignable or require locally alignable regions to appear in the same order in all sequences. In this paper, we present ProDA, a novel system for automated detection and alignment of homologous regions in collections of proteins with arbitrary domain architectures. Given an input set of unaligned sequences, ProDA identifies all homologous regions appearing in one or more sequences, and returns a collection of local multiple alignments for these regions. On a subset of the BAliBASE benchmarking suite containing curated alignments of proteins with complicated domain architectures, ProDA performs well in detecting conserved domain boundaries and clustering domain segments, achieving the highest accuracy to date for this task. We conclude that ProDA is a practical tool for automated alignment of protein sequences with repeats and rearrangements in their domain architecture

    Novel associations for hypothyroidism include known autoimmune risk loci

    Hypothyroidism is the most common thyroid disorder, affecting about 5% of the general population. Here we present the first large genome-wide association study of hypothyroidism, in 2,564 cases and 24,448 controls from the customer base of 23andMe, Inc., a personal genetics company. We identify four genome-wide significant associations, two of which are well known to be involved with a large spectrum of autoimmune diseases: rs6679677 near _PTPN22_ and rs3184504 in _SH2B3_ (p-values 3.5e-13 and 3.0e-11, respectively). We also report associations with rs4915077 near _VAV3_ (p-value 8.3e-11), another gene involved in immune function, and rs965513 near _FOXE1_ (p-value 3.1e-14). Of these, the association with _PTPN22_ confirms a recent small candidate gene study, and _FOXE1_ was previously known to be associated with thyroid-stimulating hormone (TSH) levels. Although _SH2B3_ has been previously linked with a number of autoimmune diseases, this is the first report of its association with thyroid disease. The _VAV3_ association is novel. These results suggest heterogeneity in the genetic etiology of hypothyroidism, implicating genes involved in both autoimmune disorders and thyroid function. Using a genetic risk profile score based on the top association from each of the four genome-wide significant regions in our study, the relative risk between the highest and lowest deciles of genetic risk is 2.1

    Clearance kinetics and matrix binding partners of the receptor for advanced glycation end products

    Elucidating the sites and mechanisms of sRAGE action in the healthy state is vital to better understand the biological importance of the receptor for advanced glycation end products (RAGE). Previous studies in animal models of disease have demonstrated that exogenous sRAGE has an anti-inflammatory effect, which has been reasoned to arise from sequestration of pro-inflammatory ligands away from membrane-bound RAGE isoforms. We show here that sRAGE exhibits in vitro binding with high affinity and reversibly to extracellular matrix components collagen I, collagen IV, and laminin. Soluble RAGE administered intratracheally, intravenously, or intraperitoneally, does not distribute in a specific fashion to any healthy mouse tissue, suggesting against the existence of accessible sRAGE sinks and receptors in the healthy mouse. Intratracheal administration is the only effective means of delivering exogenous sRAGE to the lung, the organ in which RAGE is most highly expressed; clearance of sRAGE from lung does not differ appreciably from that of albumin. Copyright: © 2014 Milutinovic et al